Rhizobium leguminosarum bv. trifolii is a soil-inhabiting bacterium that has the capacity to be
an effective N2-fixing microsymbiont of Trifolium (clover) species. R. leguminosarum bv. trifolii
strain WSM1689 is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated
from a root nodule of Trifolium uniflorum collected on the edge of a valley 6 km from Eggares
on the Greek Island of Naxos. Although WSM1689 is capable of highly effective N2-fixation
with T. uniflorum, it is either unable to nodulate or unable to fix N2 with a wide range of both
perennial and annual clovers originating from Europe, North America and Africa. WSM1689
therefore possesses a very narrow host range for effective N2 fixation and can thus play a
valuable role in determining the geographic and phenological barriers to symbiotic performance
in the genus Trifolium. Here we describe the features of R. leguminosarum bv. trifolii strain
WSM1689, together with the complete genome sequence and its annotation. The 6,903,379 bp
genome contains 6,709 protein-coding genes and 89 RNA-only encoding genes. This multipartite
genome contains six distinct replicons: a chromosome of 4,854,518 bp and five plasmids of
667,306, 518,052, 341,391, 262,704 and 259,408 bp. This rhizobial genome is one of 20 sequenced
as part of a DOE Joint Genome Institute 2010 Community Sequencing Program.



Introduction 



The nitrogen (N) cycle is one of the most important biogeochemical processes underpinning
the existence of life on Earth. A key step in this cycle is the conversion of relatively inert
atmospheric dinitrogen (N2) into a bioaccessible form such as ammonia (NH3) through a process
referred to as biological nitrogen fixation (BNF). BNF is performed only by a specialized
subset of Bacteria and Archaea that possess the necessary cellular machinery to enzymatically
reduce N2 into NH3.



Some of these bacteria (termed rhizobia or root nodule bacteria) have evolved non-obligatory
symbiotic relationships with legumes whereby the bacteria receive a carbon source from the
plant and in return supply fixed N to the host [1]. Harnessing this association can boost soil
N-inputs and therefore production yields of legumes, or of non-legumes grown in subsequent
years, without the need for supplementation with industrially synthesized N-based
fertilizers [2].

Some of the most widely cultivated pasture legumes are members of the legume genus Trifolium
(clover). The natural distribution of these species spans three centers of diversity, with an
estimated 28% of species in the Americas, 57% in Eurasia and 15% in sub-Saharan Africa [3].
Approximately 30 species of clover, predominantly of Eurasian origin, are widely grown as
annual and perennial species in pasture systems in Mediterranean and temperate climatic
zones [3]. Globally important perennial species of clover include T. repens (white clover),
T. pratense (red clover), T. fragiferum (strawberry clover) and T. hybridum (alsike clover).
While clovers are known to form N2-fixing symbiotic associations with Rhizobium leguminosarum
bv. trifolii, there exists wide variation in symbiotic compatibility across different strains
and hosts, from ineffective (non-N2-fixing) nodulation to fully effective N2-fixing
partnerships.

Rhizobium leguminosarum bv. trifolii strain WSM1689 was isolated in 1995 from a nodule of
the perennial clover Trifolium uniflorum collected on the edge of a valley 6 km from Eggares
on the Greek Island of Naxos. T. uniflorum is one of a small number of perennial Trifolium spp.
found in the dry Mediterranean basin. While WSM1689 has been shown to be either ineffective
on, or unable to nodulate, a range of annual and perennial Trifolium spp., it is a highly
effective N2-fixing microsymbiont of T. uniflorum [4]. Therefore, R. leguminosarum bv. trifolii
WSM1689 has a very narrow host range and thus represents a good isolate with which to study
the genetic basis of symbiotic specificity. The availability of this sequence data also
complements the already published genomes of the clover-nodulating R. leguminosarum bv.
trifolii WSM1325 [5] and WSM2304 [6]. Here we present a summary classification and a set of
general features for R. leguminosarum bv. trifolii strain WSM1689 together with the
description of the complete genome sequence and its annotation.



Classification and features 

R. leguminosarum bv. trifolii strain WSM1689 is a motile, non-sporulating, non-encapsulated,
Gram-negative rod in the order Rhizobiales of the class Alphaproteobacteria. The rod-shaped
form varies in size with dimensions of approximately 0.25-0.5 μm in width and 2.0 μm in length
(Figure 1 Left and 1 Center). It is fast growing, forming colonies within 3-4 days when grown
on half strength Lupin Agar (½LA) [7], tryptone-yeast extract agar (TY) [8] or a modified
yeast-mannitol agar (YMA) [9] at 28°C. Colonies on ½LA are opaque, slightly domed and
moderately mucoid with smooth margins (Figure 1 Right). Minimum Information about the Genome
Sequence (MIGS) is provided in Table 1.

Figure 2 shows the phylogenetic neighborhood of 
R. leguminosarum bv. trifolii strain WSM1689 in a 
16S rRNA gene sequence based tree. This strain 
shares 100% (1362/1362 bp) sequence identity 
to the 16S rRNA gene of R. leguminosarum bv. 
trifolii strain WSM1325 [5] and R. leguminosarum 
bv. trifolii strain WSM2304 [6]. 
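
The identity figure quoted above (1362 matching positions out of 1362 compared) is a simple ratio over aligned positions. The following is a minimal sketch of that calculation, assuming the two sequences are already aligned and of equal length; the function name `percent_identity` and the toy sequences are illustrative only and are not taken from the original analysis.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of identical positions between two aligned sequences.

    Positions where either sequence carries a gap character ('-') are skipped,
    so the result is computed over gap-free columns only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap-containing columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA gene data).
identical = percent_identity("ACGTACGT", "ACGTACGT")
one_mismatch = percent_identity("ACGTACGT", "ACGTACGA")
print(identical, one_mismatch)
```

With real data, the two inputs would be the aligned 16S rRNA gene regions of the strains being compared.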




Figure 1. Images of Rhizobium leguminosarum bv. trifolii strain WSM1689 using scanning (Left) and transmission
(Center) electron microscopy and the appearance of colony morphology on ½LA (Right).

Table 1. Classification and general features of Rhizobium leguminosarum bv. trifolii strain WSM1689 according
to the MIGS recommendations [10,11].

MIGS ID    Property                  Term                                            Evidence code
           Current classification    Domain Bacteria                                 TAS [11]
                                     Phylum Proteobacteria                           TAS [12]
                                     Class Alphaproteobacteria                       TAS [13,14]
                                     Order Rhizobiales                               TAS [14,15]
                                     Family Rhizobiaceae                             TAS [16,17]
                                     Genus Rhizobium                                 TAS [16,18-21]
                                     Species Rhizobium leguminosarum bv. trifolii    TAS [16,18,21,22]
                                     Strain WSM1689                                  TAS [4]
           Gram stain                Negative                                        IDA
           Cell shape                Rod                                             IDA
           Motility                  Motile                                          IDA
           Sporulation               Non-sporulating                                 NAS
           Temperature range         Mesophile                                       NAS
           Optimum temperature       28°C                                            NAS
           Salinity                  Not reported                                    NAS
MIGS-22    Oxygen requirement        Aerobic                                         TAS [4]
           Carbon source             Varied                                          NAS
           Energy source             Chemoorganotroph                                NAS
MIGS-6     Habitat                   Soil, root nodule, host                         TAS [4]
MIGS-15    Biotic relationship       Free living, symbiotic                          TAS [4]
MIGS-14    Pathogenicity             Non-pathogenic                                  NAS
           Biosafety level           1                                               NAS [23]
           Isolation                 Root nodule                                     TAS [4]
MIGS-4     Geographic location       Naxos, Greece                                   IDA
MIGS-5     Nodule collection date    1995                                            IDA
MIGS-4.1   Latitude                  37.128333                                       IDA
MIGS-4.2   Longitude                 25.443333                                       IDA
MIGS-4.3   Depth                     Not reported
MIGS-4.4   Altitude                  Not reported

Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in 
the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, 
but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are 
from the Gene Ontology project [24]. 

Figure 2. Phylogenetic tree showing the relationship of Rhizobium leguminosarum bv. trifolii
WSM1689 (shown in bold print) to other root nodulating Rhizobium spp. in the order Rhizobiales
based on aligned sequences of the 16S rRNA gene (1,180 bp internal region). All positions
containing gaps and missing data were eliminated. All sites were informative and there were no
gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5 [25]. The tree
was built using the Maximum-Likelihood method with the General Time Reversible model [26].
Bootstrap analysis [27] with 500 replicates was performed to assess the support of the clusters.
Type strains are indicated with a superscript T. Brackets after the strain name contain a DNA
database accession number and/or a GOLD ID (beginning with the prefix G) for a sequencing
project registered in GOLD [28]. Published genomes are indicated with an asterisk.

Table 2. Compatibility of WSM1689 with both perennial and annual Trifolium genotypes for nodulation
(Nod) and N2-fixation (Fix). Data compiled from [4].

Species name        Cultivar       Origin           Growth habit   Nod    Fix    Comment
T. uniflorum        [illegible]    Europe           Perennial      Nod+   Fix+   Highly effective
T. tumens           [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
T. tumens           [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
T. medium           [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
T. repens           [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
[illegible]         [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
T. pratense         [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
[illegible]         [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
T. ambiguum         Endura         Europe           Perennial      Nod+   Fix-   Ineffective
T. canescens        [illegible]    Europe           Perennial      Nod+   Fix-   Ineffective
T. fragiferum       [illegible]    Europe           Perennial      Nod-          No nodulation
[illegible]         [illegible]    South America    Perennial      Nod+   Fix-   Ineffective
T. longipes         [illegible]    North America    Perennial      Nod-          No nodulation
T. subterraneum     York           Europe           Annual         Nod+   Fix-   Ineffective
T. glanduliferum    CPI 87182      Europe           Annual         Nod+   Fix-   Ineffective
T. multinerve       87259          Africa           Annual         Nod-          No nodulation
T. tridentatum      CQ1263         North America    Annual         Nod+   Fix-   Ineffective


Symbiotaxonomy 

R. leguminosarum bv. trifolii WSM1689 is a highly effective microsymbiont of the perennial
Eurasian clover Trifolium uniflorum (Table 2). In contrast, WSM1689 does not nodulate the
perennial T. fragiferum and forms white ineffective (Fix-) nodules with other perennial and
annual clovers of Eurasian origin. Moreover, WSM1689 is either Nod- or Fix- on clovers of
North American or African origin. Therefore, WSM1689 is unusual in having an extremely narrow
clover host range for the establishment of effective N2-fixing symbiosis.

Genome sequencing and annotation 

Genome project history 

This organism was selected for sequencing on the basis of its environmental and agricultural
relevance to issues in global carbon cycling, alternative energy production, and
biogeochemical importance, and is part of the Community Sequencing Program at the U.S.
Department of Energy, Joint Genome Institute (JGI) for projects of relevance to agency
missions. The genome project is deposited in the Genomes OnLine Database [28] and a finished
genome sequence in IMG/GEBA. Sequencing, finishing and annotation were performed by the JGI.
A summary of the project information is shown in Table 3.

Growth conditions and DNA isolation 

Rhizobium leguminosarum bv. trifolii strain WSM1689 was grown to mid logarithmic phase in
TY rich medium on a gyratory shaker at 28°C [29]. DNA was isolated from 60 mL of cells using
a CTAB (cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [30].

Table 3. Genome sequencing project information.

MIGS ID     Property                   Term
MIGS-31     Finishing quality          Finished
MIGS-28     Libraries used             Illumina GAii shotgun and paired end 454 libraries
MIGS-29     Sequencing platforms       Illumina GAii and 454 GS FLX Titanium technologies
MIGS-31.2   Sequencing coverage        8.3x 454, 774.6x Illumina
MIGS-30     Assemblers                 VELVET, version 1.1.05; Newbler, version 2.6; phrap, version SPS - 4.24
MIGS-32     Gene calling methods       Prodigal 1.4, GenePRIMP
            Genbank ID                 Not yet available
            Genbank Date of Release    Not yet released
            GOLD ID                    Gi06499
            NCBI project ID            62289
            Database: IMG-GEBA         2510065019
            Project relevance          Symbiotic nitrogen fixation, agriculture



Genome sequencing and assembly 

The genome of Rhizobium leguminosarum bv. trifolii strain WSM1689 was sequenced at the Joint
Genome Institute (JGI) using a combination of Illumina [31] and 454 technologies [32]. Two
libraries were generated for this genome: an Illumina GAii shotgun library, which produced
73,565,648 reads totaling 5,591 Mbp, and a paired end 454 library with an average insert size
of 12 Kbp, which produced 376,185 reads totaling 93.4 Mbp. All general aspects of library
construction and sequencing performed at the JGI can be found at [30]. The initial draft
assembly contained 100 contigs in 4 scaffolds. The 454 paired end data was assembled with
Newbler, version 2.6. The Newbler consensus sequences were computationally shredded into 2 Kbp
overlapping fake reads (shreds). Illumina sequencing data was assembled with VELVET, version
1.1.05 [33], and the consensus sequence was computationally shredded into 1.5 Kbp overlapping
fake reads (shreds). We integrated the 454 Newbler consensus shreds, the Illumina VELVET
consensus shreds and the read pairs in the 454 paired end library using parallel phrap,
version SPS - 4.24 (High Performance Software, LLC). The software Consed [34-36] was used in
the following finishing process. Illumina data was used to correct potential base errors and
increase consensus quality using the software Polisher developed at JGI (Alla Lapidus,
unpublished). Possible mis-assemblies were corrected using gapResolution (Cliff Han,
unpublished), Dupfinisher [37], or sequencing cloned bridging PCR fragments with subcloning.
Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR (J-F Cheng,
unpublished) primer walks. A total of 93 additional reactions were necessary to close gaps
and to raise the quality of the finished sequence.
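
The shredding step described above, in which an assembled consensus is cut into overlapping pseudo-reads before the combined assembly, can be illustrated with a short sketch. This is not the JGI implementation: the function name `shred` is hypothetical, and the amount of overlap between consecutive shreds is not stated in the text, so it is exposed here as a parameter that the caller must choose.

```python
def shred(consensus: str, shred_len: int, overlap: int) -> list[str]:
    """Cut a consensus sequence into overlapping fixed-length pseudo-reads.

    Consecutive shreds start (shred_len - overlap) bases apart; the final
    shred may be shorter than shred_len when the sequence length does not
    divide evenly.
    """
    if not 0 <= overlap < shred_len:
        raise ValueError("overlap must be non-negative and smaller than shred_len")
    if not consensus:
        return []
    step = shred_len - overlap
    shreds = []
    start = 0
    while True:
        shreds.append(consensus[start:start + shred_len])
        if start + shred_len >= len(consensus):
            break  # the last shred reached the end of the consensus
        start += step
    return shreds


# Toy example: a 10-base sequence cut into 4-base shreds overlapping by 2 bases.
toy_shreds = shred("ACGTACGTAC", shred_len=4, overlap=2)
print(toy_shreds)
```

For the 2 Kbp and 1.5 Kbp shreds mentioned in the text, `shred_len` would be 2000 or 1500 respectively, with an overlap value that is an assumption rather than a documented setting.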



The total genome size is 6.9 Mbp and the final assembly is based on 57.3 Mbp of 454 draft
data, which provides an average 8.3x coverage of the genome, and 5,345 Mbp of Illumina draft
data, which provides an average 774.6x coverage of the genome.
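
The coverage figures above follow from dividing the amount of sequence data by the genome size. A minimal sketch of that arithmetic, using only the values quoted in the text (the function and variable names are illustrative):

```python
GENOME_SIZE_MBP = 6.9  # total genome size as quoted in the text


def mean_coverage(data_mbp: float, genome_mbp: float = GENOME_SIZE_MBP) -> float:
    """Average fold-coverage: sequenced bases divided by genome size."""
    return data_mbp / genome_mbp


coverage_454 = round(mean_coverage(57.3), 1)        # 454 draft data
coverage_illumina = round(mean_coverage(5345), 1)   # Illumina draft data
print(f"454: {coverage_454}x, Illumina: {coverage_illumina}x")
```

Rounded to one decimal place, these ratios reproduce the 8.3x and 774.6x values reported for the 454 and Illumina data.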

Genome annotation 

Genes were identified using Prodigal [38] as part of the DOE-JGI genome annotation pipeline,
followed by a round of manual curation using the JGI GenePRIMP pipeline [39]. The predicted
CDSs were translated and used to search the National Center for Biotechnology Information
(NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro
databases. These data sources were combined to assert a product description for each
predicted protein. Non-coding genes and miscellaneous features were predicted using
tRNAscan-SE [40], RNAmmer [41], Rfam [42], TMHMM [43], and SignalP [44]. Additional gene
prediction analyses and functional annotation were performed within the Integrated Microbial
Genomes (IMG-ER) platform [45,46].

Genome properties 

The genome is 6,903,379 nucleotides with 60.94% GC content (Table 4 and Figures 3a-3f) and
comprises 6 replicons. From a total of 6,798 genes, 6,709 were protein encoding and 89 were
RNA-only encoding genes. Within the genome, 206 pseudogenes were also identified. The majority
of genes (79.52%) were assigned a putative function whilst the remaining genes were annotated
as hypothetical. The distribution of genes into COGs functional categories is presented in
Table 5.
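
The headline numbers in this section are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below uses only values reported in this paper (replicon sizes from the abstract, G+C base count and function-prediction count from Table 4); the plasmid labels are placeholders, since the text does not name the plasmids.

```python
# Replicon sizes in bp; "plasmid_1" ... "plasmid_5" are placeholder labels.
REPLICONS_BP = {
    "chromosome": 4_854_518,
    "plasmid_1": 667_306,
    "plasmid_2": 518_052,
    "plasmid_3": 341_391,
    "plasmid_4": 262_704,
    "plasmid_5": 259_408,
}
GENOME_BP = 6_903_379
GC_BP = 4_206_909            # G+C bases (Table 4)
PROTEIN_GENES = 6_709
RNA_GENES = 89
GENES_WITH_FUNCTION = 5_406  # genes with function prediction (Table 4)


def percent(part: int, whole: int) -> float:
    """Percentage of whole represented by part, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


replicon_total = sum(REPLICONS_BP.values())
total_genes = PROTEIN_GENES + RNA_GENES
gc_percent = percent(GC_BP, GENOME_BP)
function_percent = percent(GENES_WITH_FUNCTION, total_genes)
print(replicon_total, total_genes, gc_percent, function_percent)
```

The six replicons sum to the stated genome size, and the derived percentages match the 60.94% GC content and 79.52% function-prediction figures.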

Table 4. Genome statistics.

Attribute                              Value        % of Total
Genome size (bp)                       6,903,379    100.00
DNA coding region (bp)                 6,004,795    86.98
DNA G+C content (bp)                   4,206,909    60.94
Number of replicons                    6
Total genes                            6,798        100.00
RNA genes                              89           1.31
Protein-coding genes                   6,709        98.69
Genes with function prediction         5,406        79.52
Genes assigned to COGs                 5,400        79.44
Genes assigned Pfam domains            5,618        82.64
Genes with signal peptides             591          8.69
Genes coding transmembrane proteins    1,524        22.42
CRISPR repeats                         0






Figure 3a. Graphical circular map of replicon WSM1689_Rleg3_Contig1814.1 of the Rhizobium
leguminosarum bv. trifolii strain WSM1689 genome. From outside to the center: Genes on forward strand
(color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG
categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Figure 3b. Graphical circular map of replicon WSM1689_Rleg3_Contig1813.2 of the Rhizobium
leguminosarum bv. trifolii strain WSM1689 genome. From outside to the center: Genes on forward strand
(color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG
categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Figure 3c. Graphical circular map of replicon WSM1689_Rleg3_Contig1812.3 of the Rhizobium
leguminosarum bv. trifolii strain WSM1689 genome. From outside to the center: Genes on forward strand
(color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG
categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Figure 3d. Graphical circular map of replicon WSM1689_Rleg3_Contig1810.5 of the Rhizobium
leguminosarum bv. trifolii strain WSM1689 genome. From outside to the center: Genes on forward strand
(color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG
categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Figure 3e. Graphical circular map of replicon WSM1689_Rleg3_Contig1811.4 of the Rhizobium
leguminosarum bv. trifolii strain WSM1689 genome. From outside to the center: Genes on forward strand
(color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG
categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Figure 3f. Graphical circular map of replicon WSM1689_Rleg3_Contig1809.6 of the Rhizobium
leguminosarum bv. trifolii strain WSM1689 genome. From outside to the center: Genes on forward strand
(color by COG categories as denoted by the IMG platform), Genes on reverse strand (color by COG
categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.



Table 5. Number of protein coding genes of Rhizobium leguminosarum bv. trifolii strain
WSM1689 associated with the general COG functional categories.

Code   Value   %age    COG Category
J      205     3.40    Translation, ribosomal structure and biogenesis
A      0       0.00    RNA processing and modification
K      581     9.62    Transcription
L      153     2.53    Replication, recombination and repair
B      2       0.03    Chromatin structure and dynamics
D      39      0.65    Cell cycle control, mitosis and meiosis
Y      0       0.00    Nuclear structure
V      66      1.09    Defense mechanisms
T      311     5.15    Signal transduction mechanisms
M      329     5.45    Cell wall/membrane biogenesis
N      81      1.34    Cell motility
Z      0       0.00    Cytoskeleton
W      0       0.00    Extracellular structures
U      82      1.36    Intracellular trafficking and secretion
O      187     3.10    Posttranslational modification, protein turnover, chaperones
C      311     5.15    Energy production and conversion
G      683     11.31   Carbohydrate transport and metabolism
E      629     10.42   Amino acid transport and metabolism
F      105     1.74    Nucleotide transport and metabolism
H      192     3.18    Coenzyme transport and metabolism
I      222     3.68    Lipid transport and metabolism
P      297     4.92    Inorganic ion transport and metabolism
Q      147     2.43    Secondary metabolite biosynthesis, transport and catabolism
R      795     13.17   General function prediction only
S      620     10.27   Function unknown
-      1,398   20.56   Not in COGs
       6,037           Total
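
As a consistency check on Table 5, the per-category counts should add up to the listed total of 6,037, and each percentage should equal the count divided by that total. Under that assumption, the count for any single category can be recovered from the total and the remaining counts. The sketch below does this for category P (inorganic ion transport and metabolism); the variable names are illustrative.

```python
# COG category counts from Table 5, excluding category P.
COUNTS_WITHOUT_P = {
    "J": 205, "A": 0, "K": 581, "L": 153, "B": 2, "D": 39, "Y": 0, "V": 66,
    "T": 311, "M": 329, "N": 81, "Z": 0, "W": 0, "U": 82, "O": 187, "C": 311,
    "G": 683, "E": 629, "F": 105, "H": 192, "I": 222, "Q": 147, "R": 795,
    "S": 620,
}
TABLE_TOTAL = 6_037  # total listed in the last row of Table 5

# Assumes the listed total is the plain sum of all category counts.
implied_p_count = TABLE_TOTAL - sum(COUNTS_WITHOUT_P.values())
implied_p_percent = round(100.0 * implied_p_count / TABLE_TOTAL, 2)

# Spot-check one listed percentage against the same denominator.
k_percent = round(100.0 * COUNTS_WITHOUT_P["K"] / TABLE_TOTAL, 2)
print(implied_p_count, implied_p_percent, k_percent)
```

The same division reproduces the other listed percentages, which suggests the %age column is computed against the 6,037 COG assignments rather than against the 6,798 total genes.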